Project Workflow: Census Demographics, Housing, and Geometries¶
This notebook documents the main analysis workflow for this project using preprocessed data. It shows how to load the curated datasets, perform representative cleaning and transformation steps, join demographics to geometries, and produce both static and interactive visualizations.
All file paths are relative to the project root. A classmate with this repository and the same environment can run the notebook from top to bottom and reproduce the results.
Setup Instructions¶
Before running this notebook:
Create the conda environment:
conda env create -f environment.yml
Register the Jupyter kernel:
bash setup_jupyter_kernel.shSelect the correct kernel:
- Click on the kernel name in the top right of the notebook
- Select "Python (cyplan-101-env)" from the dropdown menu
- This ensures all packages (including
fionaandgeopandas) are available
Verify data files exist:
- Check that
data/gold/directory contains the required files - See the "Data sources" section below for expected files
- Check that
Once the kernel is selected, you can run all cells from top to bottom.
import os
import sys
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import folium
# Ensure plots appear inline in the notebook
%matplotlib inline
# Set a consistent random seed for any sampling operations
RANDOM_SEED = 42
# Define base directory for project data relative to this notebook
PROJECT_ROOT = os.path.abspath(os.path.join(os.getcwd(), ".."))
DATA_DIR = os.path.join(PROJECT_ROOT, "data")
print("Project root:", PROJECT_ROOT)
print("Data dir:", DATA_DIR)
Project root: /Users/admin/cyplan-101-proj1 Data dir: /Users/admin/cyplan-101-proj1/data
Data sources¶
This project uses preprocessed "gold" datasets that have already gone through earlier bronze and silver pipeline stages. In this notebook we focus on:
data/gold/demographics/census_demographics_dp05.csv
Demographic indicators at the census tract level, including total population and age distribution.data/gold/demographics/census_housing_s1901.csv
Household income and related measures at the census tract level, including median household income.data/gold/geometries/census_tract_boundaries.geojson
Polygon geometries for census tracts in the study area. These include ageoid_fullidentifier used to link to tabular data.data/gold/geometries/address_points.geoparquet
Point locations representing address records. These provide context on where activity is concentrated.data/gold/geometries/street_centerlines.geoparquet
Line geometries representing street centerlines in the study area.
The goal is to connect tract-level demographic and housing data to the tract geometries and explore spatial patterns using maps and summary plots.
# Define paths to gold-layer datasets used in this notebook.
# Using relative paths tied to the project structure keeps the workflow reproducible.
demographics_path = os.path.join(DATA_DIR, "gold", "demographics", "census_demographics_dp05.csv")
housing_path = os.path.join(DATA_DIR, "gold", "demographics", "census_housing_s1901.csv")
tracts_path = os.path.join(DATA_DIR, "gold", "geometries", "census_tract_boundaries.geojson")
address_points_path = os.path.join(DATA_DIR, "gold", "geometries", "address_points.geoparquet")
streets_path = os.path.join(DATA_DIR, "gold", "geometries", "street_centerlines.geoparquet")
# Load tabular data from the gold layer. These tables contain one row per census tract.
demographics = pd.read_csv(demographics_path)
housing = pd.read_csv(housing_path)
# Load tract geometries and supporting geographic context using GeoPandas.
tracts = gpd.read_file(tracts_path)
address_points = gpd.read_parquet(address_points_path)
street_centerlines = gpd.read_parquet(streets_path)
print("Demographics rows:", len(demographics))
print("Housing rows:", len(housing))
print("Tracts rows:", len(tracts))
Demographics rows: 379 Housing rows: 379 Tracts rows: 378
# Inspect a few columns to confirm tract identifiers and key measures.
# The `geography` column contains a full census tract identifier (e.g., 1400000US06001400100).
# The `geoid_full` column on the tract geometries uses the same identifier pattern.
print("Demographics columns:\n", demographics.columns[:10])
print("Housing columns:\n", housing.columns[:10])
print("Tract columns:\n", tracts.columns)
# Show the first few rows of each dataset to ground the rest of the workflow.
demographics.head()
Demographics columns:
Index(['geography', 'est_geographic_area_name',
'est_sex_and_age_total_population', 'moe_sex_and_age_total_population',
'est_sex_and_age_total_population_male',
'moe_sex_and_age_total_population_male',
'est_sex_and_age_total_population_female',
'moe_sex_and_age_total_population_female',
'est_sex_and_age_total_population_sex_ratio_males_per_100_females',
'moe_sex_and_age_total_population_sex_ratio_males_per_100_females'],
dtype='object')
Housing columns:
Index(['geography', 'est_geographic_area_name', 'est_households_total',
'moe_households_total', 'est_households_total_less_than_dollars10000',
'moe_households_total_less_than_dollars10000',
'est_households_total_dollars10000_to_dollars14999',
'moe_households_total_dollars10000_to_dollars14999',
'est_households_total_dollars15000_to_dollars24999',
'moe_households_total_dollars15000_to_dollars24999'],
dtype='object')
Tract columns:
Index(['geoid_full', 'tract_name', 'tract_full_name', 'min_longitude',
'max_longitude', 'min_latitude', 'max_latitude', 'center_longitude',
'center_latitude', 'geometry'],
dtype='object')
| geography | est_geographic_area_name | est_sex_and_age_total_population | moe_sex_and_age_total_population | est_sex_and_age_total_population_male | moe_sex_and_age_total_population_male | est_sex_and_age_total_population_female | moe_sex_and_age_total_population_female | est_sex_and_age_total_population_sex_ratio_males_per_100_females | moe_sex_and_age_total_population_sex_ratio_males_per_100_females | ... | moe_percent_hispanic_or_latino_and_race_total_population_not_hispanic_or_latino_two_or_more_races_two_races_excluding_some_other_race_and_three_or_more_races | est_percent_total_housing_units | moe_percent_total_housing_units | est_percent_citizen_voting_age_population_citizen_18_and_over_population | moe_percent_citizen_voting_age_population_citizen_18_and_over_population | est_percent_citizen_voting_age_population_citizen_18_and_over_population_male | moe_percent_citizen_voting_age_population_citizen_18_and_over_population_male | est_percent_citizen_voting_age_population_citizen_18_and_over_population_female | moe_percent_citizen_voting_age_population_citizen_18_and_over_population_female | unnamed:_378 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1400000US06001400100 | Census Tract 4001; Alameda County; California | 3094 | 452 | 1552 | 334 | 1542 | 190 | 100.6 | 19.5 | ... | 2.4 | (X) | (X) | 2256 | (X) | 47.2 | 4.3 | 52.8 | 4.3 | NaN |
| 1 | 1400000US06001400200 | Census Tract 4002; Alameda County; California | 2093 | 224 | 1081 | 180 | 1012 | 155 | 106.8 | 24.8 | ... | 2.6 | (X) | (X) | 1636 | (X) | 51.9 | 6.1 | 48.1 | 6.1 | NaN |
| 2 | 1400000US06001400300 | Census Tract 4003; Alameda County; California | 5727 | 607 | 2818 | 487 | 2909 | 337 | 96.9 | 19.8 | ... | 4.6 | (X) | (X) | 4443 | (X) | 47.5 | 5.0 | 52.5 | 5.0 | NaN |
| 3 | 1400000US06001400400 | Census Tract 4004; Alameda County; California | 4395 | 640 | 2001 | 346 | 2394 | 419 | 83.6 | 16.3 | ... | 1.9 | (X) | (X) | 3183 | (X) | 46.5 | 5.3 | 53.5 | 5.3 | NaN |
| 4 | 1400000US06001400500 | Census Tract 4005; Alameda County; California | 3822 | 737 | 1885 | 281 | 1937 | 560 | 97.3 | 26.1 | ... | 4.7 | (X) | (X) | 3056 | (X) | 44.9 | 7.2 | 55.1 | 7.2 | NaN |
5 rows × 379 columns
# Display a small sample from the housing and tract tables.
# This helps verify that the same tract identifier appears in both tabular and spatial data.
housing.head()
| geography | est_geographic_area_name | est_households_total | moe_households_total | est_households_total_less_than_dollars10000 | moe_households_total_less_than_dollars10000 | est_households_total_dollars10000_to_dollars14999 | moe_households_total_dollars10000_to_dollars14999 | est_households_total_dollars15000_to_dollars24999 | moe_households_total_dollars15000_to_dollars24999 | ... | moe_nonfamily_households_median_income_dollars | est_nonfamily_households_mean_income_dollars | moe_nonfamily_households_mean_income_dollars | est_nonfamily_households_percent_allocated_household_income_in_the_past_12_months | moe_nonfamily_households_percent_allocated_household_income_in_the_past_12_months | est_nonfamily_households_percent_allocated_family_income_in_the_past_12_months | moe_nonfamily_households_percent_allocated_family_income_in_the_past_12_months | est_nonfamily_households_percent_allocated_nonfamily_income_in_the_past_12_months | moe_nonfamily_households_percent_allocated_nonfamily_income_in_the_past_12_months | unnamed:_130 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1400000US06001400100 | Census Tract 4001; Alameda County; California | 1316 | 142 | 6.0 | 5.8 | 0.0 | 3.2 | 1.8 | 2.9 | ... | 97096 | 225423 | 62111 | (X) | (X) | (X) | (X) | 20.6 | (X) | NaN |
| 1 | 1400000US06001400200 | Census Tract 4002; Alameda County; California | 861 | 86 | 1.4 | 1.7 | 1.7 | 1.5 | 1.6 | 1.5 | ... | 41387 | 153186 | 24879 | (X) | (X) | (X) | (X) | 31.5 | (X) | NaN |
| 2 | 1400000US06001400300 | Census Tract 4003; Alameda County; California | 2713 | 273 | 4.4 | 3.8 | 3.5 | 2.9 | 3.6 | 3.2 | ... | 18869 | 126792 | 23123 | (X) | (X) | (X) | (X) | 31.3 | (X) | NaN |
| 3 | 1400000US06001400400 | Census Tract 4004; Alameda County; California | 1803 | 173 | 0.4 | 0.8 | 0.4 | 0.7 | 7.0 | 4.2 | ... | 41855 | 145833 | 23424 | (X) | (X) | (X) | (X) | 35.4 | (X) | NaN |
| 4 | 1400000US06001400500 | Census Tract 4005; Alameda County; California | 1655 | 283 | 1.5 | 1.5 | 2.2 | 2.0 | 8.8 | 4.7 | ... | 17894 | 105792 | 17797 | (X) | (X) | (X) | (X) | 28.4 | (X) | NaN |
5 rows × 131 columns
tracts.head()
| geoid_full | tract_name | tract_full_name | min_longitude | max_longitude | min_latitude | max_latitude | center_longitude | center_latitude | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1400000US06001438201 | 4382.01 | Census Tract 4382.01 | -122.084513 | -122.071712 | 37.617563 | 37.631558 | -122.076439 | 37.626268 | POLYGON ((-122.07 37.63, -122.07 37.63, -122.0... |
| 1 | 1400000US06001440333 | 4403.33 | Census Tract 4403.33 | -122.087429 | -122.061052 | 37.596433 | 37.613476 | -122.073828 | 37.605330 | POLYGON ((-122.07 37.61, -122.07 37.61, -122.0... |
| 2 | 1400000US06001441524 | 4415.24 | Census Tract 4415.24 | -122.059295 | -122.038506 | 37.549454 | 37.569413 | -122.050693 | 37.559478 | POLYGON ((-122.05 37.57, -122.05 37.57, -122.0... |
| 3 | 1400000US06001444200 | 4442 | Census Tract 4442 | -122.055180 | -122.033659 | 37.533296 | 37.551742 | -122.045728 | 37.542543 | POLYGON ((-122.03 37.54, -122.03 37.54, -122.0... |
| 4 | 1400000US06001444304 | 4443.04 | Census Tract 4443.04 | -122.070890 | -122.005966 | 37.478609 | 37.534691 | -122.035882 | 37.508137 | POLYGON ((-122.04 37.53, -122.04 37.53, -122.0... |
Creating a shared tract identifier¶
The gold-layer tables use two different identifier fields for census tracts:
- The demographic and housing tables use a
geographystring such as1400000US06001400100. - The tract geometry file stores a
geoid_fullfield with the same pattern.
In this section we create a simplified geoid column in each table by removing the
common 1400000US prefix. This provides a clean key for joining tabular attributes
to tract geometries.
# Create a simplified geoid key shared across demographics, housing, and tracts.
# The original identifiers start with '1400000US'; removing this prefix leaves the
# underlying state + county + tract code.
prefix = "1400000US"
demographics["geoid"] = demographics["geography"].str.replace(prefix, "", regex=False)
housing["geoid"] = housing["geography"].str.replace(prefix, "", regex=False)
if "geoid_full" in tracts.columns:
tracts["geoid"] = tracts["geoid_full"].str.replace(prefix, "", regex=False)
else:
raise KeyError("Expected 'geoid_full' column in tract geometries.")
# Verify that the same geoid values appear in all three tables.
print("Unique geoids in demographics:", demographics["geoid"].nunique())
print("Unique geoids in housing:", housing["geoid"].nunique())
print("Unique geoids in tracts:", tracts["geoid"].nunique())
Unique geoids in demographics: 379 Unique geoids in housing: 379 Unique geoids in tracts: 378
# Select a focused set of demographic variables for this notebook.
# These come from the DP05 table and will be used to illustrate joins and summaries.
demo_columns = [
"geoid",
"est_sex_and_age_total_population",
"est_race_total_population_one_race_white",
"est_race_total_population_one_race_black_or_african_american",
"est_race_total_population_one_race_asian",
"est_hispanic_or_latino_and_race_total_population_hispanic_or_latino_of_any_race",
]
demographics_small = demographics[demo_columns].copy()
demographics_small.head()
| geoid | est_sex_and_age_total_population | est_race_total_population_one_race_white | est_race_total_population_one_race_black_or_african_american | est_race_total_population_one_race_asian | est_hispanic_or_latino_and_race_total_population_hispanic_or_latino_of_any_race | |
|---|---|---|---|---|---|---|
| 0 | 06001400100 | 3094 | 2246 | 137 | 462 | 200 |
| 1 | 06001400200 | 2093 | 1423 | 71 | 256 | 196 |
| 2 | 06001400300 | 5727 | 3478 | 524 | 609 | 497 |
| 3 | 06001400400 | 4395 | 2742 | 457 | 422 | 604 |
| 4 | 06001400500 | 3822 | 1773 | 919 | 312 | 557 |
# Select the tract identifier and median household income from the housing table.
# This keeps the notebook focused on one income measure that is easy to interpret.
income_column = "est_households_median_income_dollars"
housing_small = housing[["geoid", income_column]].copy()
# Convert income column to numeric, handling special cases like "250,000+"
def convert_income_to_numeric(value):
"""Convert income string to numeric value."""
if pd.isna(value):
return None
if isinstance(value, (int, float)):
return float(value)
# Handle string values like "250,000+"
value_str = str(value).strip()
if value_str.endswith("+"):
# For "250,000+", convert to 250000
value_str = value_str[:-1].replace(",", "")
return float(value_str)
# Remove commas and convert to float
value_str = value_str.replace(",", "")
try:
return float(value_str)
except ValueError:
return None
housing_small["median_income_numeric"] = housing_small[income_column].apply(convert_income_to_numeric)
# Drop tracts where median income is not reported to avoid ambiguous values.
before_rows = len(housing_small)
housing_small = housing_small.dropna(subset=["median_income_numeric"])
after_rows = len(housing_small)
print(f"Dropped {before_rows - after_rows} tracts with missing median household income.")
housing_small.head()
Dropped 6 tracts with missing median household income.
| geoid | est_households_median_income_dollars | median_income_numeric | |
|---|---|---|---|
| 0 | 06001400100 | 250,000+ | 250000.0 |
| 1 | 06001400200 | 225880 | 225880.0 |
| 2 | 06001400300 | 157731 | 157731.0 |
| 3 | 06001400400 | 159612 | 159612.0 |
| 4 | 06001400500 | 96250 | 96250.0 |
Joining attributes to tract geometries¶
With a common geoid key in place, we can:
- Join the demographic and housing tables into a single tract-level attribute table.
- Attach those attributes to the tract polygons.
This produces a GeoDataFrame that is ready for both statistical summaries and map-based visualizations.
# Join demographics and housing on the shared geoid key.
# A left join preserves all tracts that appear in the demographic table.
tract_attributes = demographics_small.merge(
housing_small[["geoid", income_column, "median_income_numeric"]],
on="geoid",
how="left",
validate="one_to_one",
)
tract_attributes.head()
| geoid | est_sex_and_age_total_population | est_race_total_population_one_race_white | est_race_total_population_one_race_black_or_african_american | est_race_total_population_one_race_asian | est_hispanic_or_latino_and_race_total_population_hispanic_or_latino_of_any_race | est_households_median_income_dollars | median_income_numeric | |
|---|---|---|---|---|---|---|---|---|
| 0 | 06001400100 | 3094 | 2246 | 137 | 462 | 200 | 250,000+ | 250000.0 |
| 1 | 06001400200 | 2093 | 1423 | 71 | 256 | 196 | 225880 | 225880.0 |
| 2 | 06001400300 | 5727 | 3478 | 524 | 609 | 497 | 157731 | 157731.0 |
| 3 | 06001400400 | 4395 | 2742 | 457 | 422 | 604 | 159612 | 159612.0 |
| 4 | 06001400500 | 3822 | 1773 | 919 | 312 | 557 | 96250 | 96250.0 |
# Attach the combined attributes to tract geometries.
# This creates a GeoDataFrame with both spatial and tabular information.
tracts_with_data = tracts.merge(
tract_attributes,
on="geoid",
how="left",
validate="one_to_one",
)
tracts_with_data[["geoid", "est_sex_and_age_total_population", income_column, "median_income_numeric"]].head()
| geoid | est_sex_and_age_total_population | est_households_median_income_dollars | median_income_numeric | |
|---|---|---|---|---|
| 0 | 06001438201 | 4671 | 107500 | 107500.0 |
| 1 | 06001440333 | 2516 | 127708 | 127708.0 |
| 2 | 06001441524 | 3989 | 250,000+ | 250000.0 |
| 3 | 06001444200 | 6271 | 137773 | 137773.0 |
| 4 | 06001444304 | 6809 | 224511 | 224511.0 |
# Create a simple population density measure using tract area.
# A projected CRS is required so that area is measured in meters rather than degrees.
if tracts_with_data.crs is None or tracts_with_data.crs.is_geographic:
tracts_projected = tracts_with_data.to_crs(epsg=3857)
else:
tracts_projected = tracts_with_data.to_crs(epsg=3857)
tracts_projected["area_sqkm"] = tracts_projected.geometry.area / 1_000_000
# Avoid division by zero by excluding tracts with zero area.
tracts_projected = tracts_projected[tracts_projected["area_sqkm"] > 0].copy()
tracts_projected["pop_density"] = (
tracts_projected["est_sex_and_age_total_population"]
/ tracts_projected["area_sqkm"]
)
tracts_projected[["geoid", "area_sqkm", "pop_density", "median_income_numeric"]].head()
| geoid | area_sqkm | pop_density | median_income_numeric | |
|---|---|---|---|---|
| 0 | 06001438201 | 1.564604 | 2985.419910 | 107500.0 |
| 1 | 06001440333 | 1.564183 | 1608.507178 | 127708.0 |
| 2 | 06001441524 | 3.126476 | 1275.877314 | 250000.0 |
| 3 | 06001444200 | 3.125637 | 2006.310819 | 137773.0 |
| 4 | 06001444304 | 31.243075 | 217.936291 | 224511.0 |
# Summarize the distribution of median household income and population density.
income_summary = tracts_projected["median_income_numeric"].describe()
density_summary = tracts_projected["pop_density"].describe()
print("Median household income summary:\n", income_summary)
print("\nPopulation density (persons per sq km) summary:\n", density_summary)
Median household income summary: count 300.000000 mean 142180.023333 std 55847.475205 min 30833.000000 25% 101490.500000 50% 131883.000000 75% 182496.000000 max 250000.000000 Name: median_income_numeric, dtype: float64 Population density (persons per sq km) summary: count 304.000000 mean 1840.097771 std 1107.378718 min 0.000000 25% 1024.996900 50% 1735.252093 75% 2568.341620 max 5850.215199 Name: pop_density, dtype: float64
Visualizing income and population patterns¶
The joined GeoDataFrame supports both numeric summaries and map-based views. In this section we:
- Plot histograms of median household income and population density.
- Create a static choropleth map of median income by census tract.
These visualizations help reveal how economic conditions and population concentrations vary across the county.
# Histogram of median household income across tracts.
# Use the numeric column for plotting.
fig, ax = plt.subplots(figsize=(8, 4))
tracts_projected["median_income_numeric"].dropna().plot.hist(
bins=30,
ax=ax,
edgecolor="black",
)
ax.set_title("Distribution of Median Household Income by Census Tract")
ax.set_xlabel("Median Household Income (dollars)")
ax.set_ylabel("Number of Tracts")
plt.tight_layout()
# Histogram of population density to highlight areas of high and low concentration.
fig, ax = plt.subplots(figsize=(8, 4))
tracts_projected["pop_density"].dropna().plot.hist(
bins=30,
ax=ax,
edgecolor="black",
)
ax.set_title("Distribution of Population Density by Census Tract")
ax.set_xlabel("Population per square kilometer")
ax.set_ylabel("Number of Tracts")
plt.tight_layout()
# Static choropleth map of median household income by tract.
# This uses the tract polygons to visualize spatial variation.
tracts_for_plot = tracts_projected.to_crs(epsg=4326)
fig, ax = plt.subplots(figsize=(8, 8))
tracts_for_plot.plot(
column="median_income_numeric",
cmap="viridis",
linewidth=0.2,
edgecolor="gray",
legend=True,
ax=ax,
)
ax.set_axis_off()
ax.set_title("Median Household Income by Census Tract")
plt.tight_layout()
Interactive map of income and address density¶
To make the spatial patterns easier to explore, we now create an interactive web map. The map combines:
- A choropleth of median household income by census tract.
- A sampled set of address points to show where activity is concentrated.
The map uses the geoid key and tract geometries already prepared above, so the
workflow from raw tables to interactive visualization is fully documented.
# Reproject tract and address geometries to WGS84 for web mapping.
tracts_wgs84 = tracts_projected.to_crs(epsg=4326)
address_points_wgs84 = address_points.to_crs(epsg=4326)
# Compute a map center from tract centroids.
centroids = tracts_wgs84.geometry.centroid
center_lat = centroids.y.mean()
center_lon = centroids.x.mean()
m = folium.Map(location=[center_lat, center_lon], zoom_start=11, tiles="CartoDB positron")
# Add a choropleth of median household income by tract.
# Use the numeric column for the choropleth
folium.Choropleth(
geo_data=tracts_wgs84.to_json(),
data=tracts_wgs84,
columns=["geoid", "median_income_numeric"],
key_on="feature.properties.geoid",
fill_color="YlGnBu",
fill_opacity=0.7,
line_opacity=0.2,
legend_name="Median Household Income (dollars)",
).add_to(m)
# Sample address points to keep rendering performance reasonable.
sample_size = min(2000, len(address_points_wgs84))
address_sample = address_points_wgs84.sample(sample_size, random_state=RANDOM_SEED)
for _, row in address_sample.iterrows():
folium.CircleMarker(
location=[row.geometry.y, row.geometry.x],
radius=1,
color="black",
weight=0,
fill=True,
fill_opacity=0.4,
).add_to(m)
m
/var/folders/zf/g_wymr6j4pd1lz7svfjqwq9r0000gn/T/ipykernel_48922/601200033.py:7: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation. centroids = tracts_wgs84.geometry.centroid
How to rerun this notebook¶
To reproduce the analysis:
- Create and activate the project environment using the configuration in
environment.yml. - Start JupyterLab or Jupyter Notebook from the project root directory.
- Open
notebooks/project_workflow.ipynb. - Run all cells in order from top to bottom.
All file paths used in this notebook are relative to the repository structure, and the
notebook operates on the preprocessed gold datasets under data/gold. Runtime should
remain under approximately 15 minutes on a typical laptop because the notebook works
from existing outputs rather than recomputing earlier pipeline stages.
Summary¶
This notebook:
- Loads gold-layer demographic, housing, and geometry data for Alameda County census tracts.
- Creates a shared
geoidkey to join attributes to tract geometries. - Constructs simple derived measures such as tract area and population density.
- Visualizes spatial patterns in median household income using both static plots and an interactive Folium map.
A classmate can follow the comments and markdown cells to understand each step of the workflow and reproduce the same figures from the shared project data.